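A variety of problems in econometrics and machine learning, including instrumental variable regression and Bellman residual minimization, can be formulated as satisfying a set of conditional moment restrictions (CMR). We derive a general game-theoretic strategy for satisfying CMR that scales to nonlinear problems, is amenable to gradient-based optimization, and is able to account for finite-sample uncertainty. We recover the approaches of Dikkala et al. and Dai et al. as special cases of our general framework, before detailing various extensions and how to efficiently solve the game defined by CMR.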
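For readers unfamiliar with the term, a conditional moment restriction and the game it induces can be written generically as below. This is a textbook-style sketch, not the formulation from the paper: the symbols $f$, $z$, $x$, $\theta$, $g$, and $\mathcal{G}$ are notation assumed here for illustration, and practical objectives typically add a regularizer or restrict $\mathcal{G}$ so that the inner maximization is bounded.

```latex
% Conditional moment restriction: the residual f has zero mean given x.
\mathbb{E}\left[ f(z; \theta) \mid x \right] = 0 \quad \text{almost surely}

% Equivalent family of unconditional restrictions over test functions g.
\mathbb{E}\left[ f(z; \theta)\, g(x) \right] = 0
  \quad \text{for all square-integrable } g

% Two-player game: theta minimizes the worst-case violation found by g.
\min_{\theta} \; \max_{g \in \mathcal{G}} \;
  \mathbb{E}\left[ f(z; \theta)\, g(x) \right]

% Example (instrumental variable regression): outcome y, treatment t,
% instrument x, and residual f(z; \theta) = y - h_{\theta}(t).
```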
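We consider imitation learning problems in which the expert has access to a per-episode context that is hidden from the learner, both at demonstration time and at test time. While the learner might not be able to accurately reproduce expert behavior early in an episode, by considering the entire history of states and actions it might eventually be able to identify the context and act as the expert would. We prove that on-policy imitation learning algorithms (with or without access to a queryable expert) are better equipped to handle these asymptotic problems than off-policy methods, and that they avoid the latching behavior (naive repetition of past actions) that plagues the latter. We conduct experiments in a toy bandit domain that examine whether off-policy approaches are able to match expert performance asymptotically, in contrast to the uniform performance of on-policy approaches. We demonstrate that, on several continuous control tasks, on-policy approaches are able to use history to identify the context, while off-policy approaches actually perform worse when given access to history.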
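Online imitation learning is the problem of how best to imitate an expert given access to the environment or an accurate simulator. Prior work has shown that, in the infinite-sample regime, exact moment matching achieves value equivalence to the expert policy. However, in the finite-sample regime, even without optimization error, empirical variance can lead to a performance gap that scales as $H^2 / N$ for behavioral cloning and $H / \sqrt{N}$ for online moment matching, where $H$ is the horizon and $N$ is the size of the expert dataset. We introduce the technique of replay estimation to reduce this empirical variance: by repeatedly executing cached expert actions in a stochastic simulator, we compute a smoother estimate of the expert visitation distribution to match. In the presence of general function approximation, we prove a meta-theorem that reduces the performance gap of our approach to the parameter estimation error for offline classification (i.e., learning the expert policy). In the tabular setting or with linear function approximation, our meta-theorem shows that the performance gap incurred by our approach achieves the optimal $\widetilde{O}\left(\min\left(H^{3/2} / N,\ H / \sqrt{N}\right)\right)$ dependency, under significantly weaker assumptions than prior work. We implement multiple instantiations of our approach on several continuous control tasks and find that we are able to significantly improve policy performance across a variety of dataset sizes.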
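To make the three scaling rates concrete, the following minimal sketch evaluates them for a horizon H and an expert dataset size N. It is only arithmetic on the rates quoted in the abstract: constants and logarithmic factors are ignored, and the function names are hypothetical, chosen for this illustration.

```python
import math


def bc_gap(H, N):
    """Behavioral cloning rate quoted in the abstract: H^2 / N."""
    return H ** 2 / N


def online_mm_gap(H, N):
    """Online moment matching rate quoted in the abstract: H / sqrt(N)."""
    return H / math.sqrt(N)


def replay_gap(H, N):
    """Replay estimation rate: min(H^(3/2) / N, H / sqrt(N)), log factors dropped."""
    return min(H ** 1.5 / N, H / math.sqrt(N))


if __name__ == "__main__":
    H = 100
    # Print the three rates side by side for growing expert dataset sizes.
    for N in (100, 10_000, 1_000_000):
        print(N, bc_gap(H, N), online_mm_gap(H, N), replay_gap(H, N))
```

Because the third rate takes a minimum that includes $H / \sqrt{N}$, it is never larger than the online moment matching rate, and for $H \ge 1$ it is also never larger than the behavioral cloning rate.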
Deep learning models for learning analytics have become increasingly popular over the last few years; however, these approaches are still not widely adopted in real-world settings, likely due to a lack of trust and transparency. In this paper, we tackle this issue by implementing explainable AI methods for black-box neural networks. This work focuses on the context of online and blended learning and the use case of student success prediction models. We use a pairwise study design, enabling us to investigate controlled differences between pairs of courses. Our analyses cover five course pairs that differ in one educationally relevant aspect and two popular instance-based explainable AI methods (LIME and SHAP). We quantitatively compare the distances between the explanations across courses and methods. We then validate the explanations of LIME and SHAP with 26 semi-structured interviews of university-level educators regarding which features they believe contribute most to student success, which explanations they trust most, and how they could transform these insights into actionable course design decisions. Our results show that quantitatively, explainers significantly disagree with each other about what is important, and qualitatively, experts themselves do not agree on which explanations are most trustworthy. All code, extended results, and the interview protocol are provided at https://github.com/epfl-ml4ed/trusting-explainers.
Time series is the most prevalent form of input data for educational prediction tasks. The vast majority of research using time series data focuses on hand-crafted features, designed by experts for predictive performance and interpretability. However, extracting these features is labor-intensive for humans and computers. In this paper, we propose an approach that utilizes irregular multivariate time series modeling with graph neural networks to achieve comparable or better accuracy with raw time series clickstreams in comparison to hand-crafted features. Furthermore, we extend concept activation vectors for interpretability in raw time series models. We analyze these advances in the education domain, addressing the task of early student performance prediction for downstream targeted interventions and instructional support. Our experimental analysis on 23 MOOCs with millions of combined interactions over six behavioral dimensions shows that models designed with our approach can (i) beat state-of-the-art educational time series baselines with no feature extraction and (ii) provide interpretable insights for personalized interventions. Source code: https://github.com/epfl-ml4ed/ripple/.
Much recent work in task-oriented parsing has focused on finding a middle ground between flat slots and intents, which are inexpressive but easy to annotate, and powerful representations such as the lambda calculus, which are expressive but costly to annotate. This paper continues the exploration of task-oriented parsing by introducing a new dataset for parsing pizza and drink orders, whose semantics cannot be captured by flat slots and intents. We perform an extensive evaluation of deep-learning techniques for task-oriented parsing on this dataset, including different flavors of seq2seq systems and RNNGs. The dataset comes in two main versions, one in a recently introduced utterance-level hierarchical notation that we call TOP, and one whose targets are executable representations (EXR). We demonstrate empirically that training the parser to directly generate EXR notation not only solves the problem of entity resolution in one fell swoop and overcomes a number of expressive limitations of TOP notation, but also results in significantly greater parsing accuracy.
Finding an initial noise vector that produces an input image when fed into the diffusion process (known as inversion) is an important problem in denoising diffusion models (DDMs), with applications for real image editing. The state-of-the-art approach for real image editing with inversion uses denoising diffusion implicit models (DDIMs) to deterministically noise the image to the intermediate state along the path that the denoising would follow given the original conditioning. However, DDIM inversion for real images is unstable as it relies on local linearization assumptions, which result in the propagation of errors, leading to incorrect image reconstruction and loss of content. To alleviate these problems, we propose Exact Diffusion Inversion via Coupled Transformations (EDICT), an inversion method that draws inspiration from affine coupling layers. EDICT enables mathematically exact inversion of real and model-generated images by maintaining two coupled noise vectors which are used to invert each other in an alternating fashion. Using Stable Diffusion, a state-of-the-art latent diffusion model, we demonstrate that EDICT successfully reconstructs real images with high fidelity. On complex image datasets like MS-COCO, EDICT reconstruction significantly outperforms DDIM, improving the mean square error of reconstruction by a factor of two. Using noise vectors inverted from real images, EDICT enables a wide range of image edits--from local and global semantic edits to image stylization--while maintaining fidelity to the original image structure. EDICT requires no model training/finetuning, prompt tuning, or extra data and can be combined with any pretrained DDM. Code is available at https://github.com/salesforce/EDICT.
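The idea of two coupled vectors that invert each other in alternation can be illustrated with a generic affine coupling step. The sketch below is not the EDICT update rule: the function `f` is a hypothetical stand-in for a network prediction, and the coefficients `a` and `b` are arbitrary values chosen for illustration. It only shows why such a coupling can be undone exactly (up to floating-point rounding) without ever inverting `f` itself.

```python
import math


def f(v):
    """Hypothetical stand-in for a model prediction; it need not be invertible."""
    return [math.tanh(3.0 * t) for t in v]


def couple_forward(x, y, a=0.9, b=0.1):
    """One coupled step: x is updated using y, then y using the new x."""
    x_new = [a * xi + b * fi for xi, fi in zip(x, f(y))]
    y_new = [a * yi + b * fi for yi, fi in zip(y, f(x_new))]
    return x_new, y_new


def couple_inverse(x_new, y_new, a=0.9, b=0.1):
    """Undo couple_forward by solving the two affine updates in reverse order."""
    y = [(yi - b * fi) / a for yi, fi in zip(y_new, f(x_new))]
    x = [(xi - b * fi) / a for xi, fi in zip(x_new, f(y))]
    return x, y


def roundtrip(x0, y0, steps=10):
    """Apply `steps` forward steps, then the same number of inverse steps."""
    x, y = list(x0), list(y0)
    for _ in range(steps):
        x, y = couple_forward(x, y)
    for _ in range(steps):
        x, y = couple_inverse(x, y)
    return x, y


if __name__ == "__main__":
    start = [0.3, -1.2, 0.7]
    print(roundtrip(start, start))
```

Each update is affine in the vector being changed and depends on the other vector only through `f`, so the inverse needs only subtraction and division. A single-vector update of the form `x_new = a * x + b * f(x)`, by contrast, has no closed-form inverse for a general `f`.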
Deep learning based text-to-speech (TTS) systems have been evolving rapidly with advances in model architectures, training methodologies, and generalization across speakers and languages. However, these advances have not been thoroughly investigated for Indian language speech synthesis. Such investigation is computationally expensive given the number and diversity of Indian languages, relatively lower resource availability, and the diverse set of advances in neural TTS that remain untested. In this paper, we evaluate the choice of acoustic models, vocoders, supplementary loss functions, training schedules, and speaker and language diversity for Dravidian and Indo-Aryan languages. Based on this, we identify monolingual models with FastPitch and HiFi-GAN V1, trained jointly on male and female speakers to perform the best. With this setup, we train and evaluate TTS models for 13 languages and find our models to significantly improve upon existing models in all languages as measured by mean opinion scores. We open-source all models on the Bhashini platform.
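Natural language processing (NLP) is increasingly used to provide adaptivity in educational applications. However, recent research has highlighted a variety of biases in pre-trained language models. While existing studies investigate bias in different domains, they are limited in addressing fine-grained analysis of educational and multilingual corpora. In this work, we analyze bias across text and through multiple architectures on a corpus of 9,165 German peer reviews collected from university students over five years. Notably, our corpus includes labels such as helpfulness, quality, and critical-aspect ratings from the peer-review recipient, as well as demographic attributes. We conduct a Word Embedding Association Test (WEAT) analysis on (1) our collected corpus in connection with the clustered labels, (2) the most common pre-trained German language models (T5, BERT, and GPT-2) and GloVe embeddings, and (3) the language models after fine-tuning on our collected dataset. In contrast to our initial expectations, we found that our collected corpus does not reveal many biases in the co-occurrence analysis or in the GloVe embeddings. However, the pre-trained German language models exhibit substantial conceptual, racial, and gender bias, and their bias changes significantly along the conceptual and racial axes during fine-tuning on the peer-review data. With our research, we aim to contribute to the fourth UN Sustainable Development Goal (quality education) with a novel dataset, an understanding of biases in natural-language education data, and an account of the potential harms of not counteracting biases in language models for educational tasks.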
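As background on the metric, the WEAT effect size compares how strongly two sets of target word vectors (X, Y) associate with two sets of attribute word vectors (A, B) under cosine similarity. The following minimal sketch uses made-up two-dimensional vectors rather than real embeddings, and it is not the authors' pipeline. The use of the sample standard deviation is an assumption here, since implementations differ on sample versus population deviation.

```python
import math
from statistics import mean, stdev


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def association(w, A, B):
    """s(w, A, B): mean similarity of w to set A minus mean similarity to set B."""
    return mean(cosine(w, a) for a in A) - mean(cosine(w, b) for b in B)


def weat_effect_size(X, Y, A, B):
    """Difference of mean associations of X and Y, scaled by the spread over X and Y."""
    s_x = [association(x, A, B) for x in X]
    s_y = [association(y, A, B) for y in Y]
    # Assumption: sample standard deviation over all target words.
    return (mean(s_x) - mean(s_y)) / stdev(s_x + s_y)


if __name__ == "__main__":
    # Toy vectors (hypothetical): X leans toward A, Y leans toward B.
    A = [(1.0, 0.0), (0.9, 0.1)]
    B = [(0.0, 1.0), (0.1, 0.9)]
    X = [(1.0, 0.1), (0.8, 0.2)]
    Y = [(0.1, 1.0), (0.2, 0.8)]
    print(weat_effect_size(X, Y, A, B))
```

A positive value indicates that X is more associated with A (and Y with B) than the reverse. Swapping X and Y flips the sign.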
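Software is changing rapidly with the invention of advanced technologies and methodologies. The ability to upgrade software quickly and successfully in response to changing business requirements is more important than ever. For the long-term management of software products, measuring software maintainability is crucial. By providing accurate predictions of software maintainability, soft computing techniques have shown great promise in the software maintenance process. To better understand the role of soft computing techniques in software maintainability prediction, we aim to provide a systematic literature review of soft computing techniques for this task. First, we provide a detailed overview of software maintainability. We then explore the fundamentals of software maintainability and the reasons for adopting soft computing methods to predict it. Next, we examine the soft computing approaches employed in the software maintainability prediction process. Furthermore, we discuss the difficulties associated with using soft computing techniques to predict software maintainability, along with potential solutions. Finally, we conclude the review with some promising future directions to drive further research innovation and development in this area.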